AMENDMENT TO THE CLAIMS: 

This listing of claims will replace all prior versions, and listings, of claims in the 
application: 

1. (Previously presented) A speech processing apparatus comprising: 

generation means for generating a pseudo acoustic echo signal for each 
sample based on a current impulse response simulating an acoustic echo transfer path 
and on a source signal; 

supply means for holding the current impulse response for each sample 
and supplying the current impulse response to said generation means; 

elimination means for subtracting said pseudo acoustic echo signal from a 
near-end speech signal to remove an acoustic echo component and thereby generate 
an acoustic echo-canceled signal for each sample; 

update means for continually updating the impulse response for each 
sample by using said source signal, said acoustic echo-canceled signal and the current 
impulse response held by said supply means and for supplying the updated impulse 
response to said supply means; 

decision means for checking, in each frame, whether or not a voice is 
included in the near-end speech signal, by using time domain information and frequency 
domain information of said acoustic echo-canceled signal; 

storage means for storing one or more impulse responses in each frame; 

and 

control means for, in a frame for which the result of decision made by said 
decision means is negative, storing in said storage means the current impulse response 
held by said supply means and, in a frame for which the result of the decision is 
positive, retrieving one of the impulse responses stored in said storage means and 
supplying it to said supply means. 
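
For illustration only, and not as part of the claim language, the interplay of the generation, elimination, update, storage, and control elements of claim 1 can be sketched as follows. The sketch assumes a normalized LMS (NLMS) update, which the claim does not specify, and takes the voice decision as a caller-supplied function. The names `cancel_echo`, `voice_in_frame`, `taps`, `frame_len`, `mu`, and `max_stored`, and the choice to retrieve the oldest stored response, are all hypothetical.

```python
from collections import deque


def cancel_echo(source, mic, voice_in_frame, taps=4, frame_len=8,
                mu=0.5, eps=1e-8, max_stored=3):
    """Return the echo-canceled signal, one value per input sample.

    source: far-end (loudspeaker) samples.
    mic: near-end microphone samples containing the acoustic echo.
    voice_in_frame: hypothetical callable(frame) -> bool standing in for
        the decision means; it receives one frame of echo-canceled samples.
    """
    h = [0.0] * taps                    # current impulse response (supply means)
    stored = deque(maxlen=max_stored)   # responses saved in voice-free frames
    history = [0.0] * taps              # recent source samples, newest first
    out, frame = [], []
    for x, d in zip(source, mic):
        history = [x] + history[:-1]
        # Generation: pseudo acoustic echo from the current impulse response.
        echo = sum(hk * xk for hk, xk in zip(h, history))
        # Elimination: subtract the pseudo echo from the near-end signal.
        e = d - echo
        # Update: NLMS step (an assumed algorithm, not taken from the claim).
        power = sum(xk * xk for xk in history) + eps
        h = [hk + mu * e * xk / power for hk, xk in zip(h, history)]
        out.append(e)
        frame.append(e)
        if len(frame) == frame_len:
            if voice_in_frame(frame):
                # Positive decision: fall back to a stored response.
                # Taking the oldest one is an assumption of this sketch.
                if stored:
                    h = list(stored[0])
            else:
                # Negative decision: store the current response.
                stored.append(list(h))
            frame = []
    return out
```

Restoring a response saved during a voice-free frame is meant to undo any mis-adaptation caused by near-end speech, since the update itself keeps running on every sample.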

2. (Original) A speech processing apparatus as claimed in claim 1, 
wherein said acoustic echo-canceled signal is used for speech recognition. 

3. (Original) A speech processing apparatus as claimed in claim 2, 
further comprising: 

means for determining a spectrum for each frame by performing the 
Fourier transform on said acoustic echo-canceled signal; 

means for successively determining a spectrum mean for each frame 
based on the spectrum obtained; and 

means for successively subtracting the spectrum mean from the spectrum 
calculated for each frame from said acoustic echo-canceled signal to remove additive 
noise of an unknown source. 
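
For illustration only, and not as part of the claim language, the successive spectrum mean subtraction of claim 3 might look like the sketch below. It assumes a magnitude spectrum, a cumulative running mean over all frames seen so far, and flooring of negative results at zero. None of these choices is stated in the claim, and the names `magnitude_spectrum`, `subtract_running_mean`, and `floor` are hypothetical.

```python
import cmath


def magnitude_spectrum(frame):
    """Magnitude of the DFT of one frame, bins 0 .. len(frame) // 2."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def subtract_running_mean(frames, floor=0.0):
    """Subtract a successively updated spectrum mean from each frame."""
    mean = None
    cleaned = []
    for i, frame in enumerate(frames, start=1):
        spec = magnitude_spectrum(frame)
        if mean is None:
            mean = list(spec)
        else:
            # Cumulative mean over the first i frames (assumed form).
            mean = [m + (s - m) / i for m, s in zip(mean, spec)]
        # Flooring keeps the result non-negative (an assumption).
        cleaned.append([max(s - m, floor) for s, m in zip(spec, mean)])
    return cleaned
```

A stationary additive noise contributes roughly the same spectrum to every frame, so it accumulates in the mean and is largely removed by the subtraction.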

4. (Currently amended) A speech processing apparatus as claimed in 
claim 2, further comprising: 

means for determining a spectrum for each frame by performing the 
Fourier transform on said acoustic echo-canceled signal; 

means for successively determining a spectrum mean for each frame 
based on the spectrum obtained; 

means for successively subtracting the spectrum mean from the spectrum 
calculated for each frame from said acoustic echo-canceled signal; 

means for determining a cepstrum from the spectrum, the spectrum being 
removed of the additive noise of an unknown source by said subtraction means; 

means for determining for each talker a cepstrum mean of a speech frame 
and a cepstrum mean of a non-speech frame, separately, from the cepstrums obtained; 
and 

means for subtracting the cepstrum mean of the speech frame of each 
talker from the cepstrum of the speech frame of the talker and for subtracting the 
cepstrum mean of the non-speech frame of each talker from the cepstrum of the non- 
speech frame of the talker to correct in a lump multiplicative distortions that are 
dependent on microphone characteristics and spatial transfer characteristics from the 
mouth of the talker to the microphone, wherein said means for subtracting comprises 
first subtracting means for subtracting the cepstrum mean of the speech frame of each 
talker from the cepstrum of the speech frame of each talker and second means for 
subtracting the cepstrum mean of the non-speech frame of the talker and by said first 
subtracting means and said second subtracting means, said subtracting means corrects 
in a lump multiplicative distortions that are dependent on a microphone characteristics 
and spatial transfer characteristics from the mouth of the talker to the microphone. 

5. (Original) A speech processing apparatus as claimed in claim 2, 
further comprising: 

means for determining a spectrum for each frame by performing the 
Fourier transform on said acoustic echo-canceled signal; 

means for determining a cepstrum from the spectrum obtained; 

means for determining for each talker a cepstrum mean of a speech frame 
and a cepstrum mean of a non-speech frame, separately, from the cepstrums obtained; and 

means for subtracting the cepstrum mean of the speech frame of each 
talker from the cepstrum of the speech frame of the talker and for subtracting the 
cepstrum mean of the non-speech frame of each talker from the cepstrum of the non- 
speech frame of the talker to correct multiplicative distortions that are dependent on 
microphone characteristics and spatial transfer characteristics from the mouth of the 
talker to the microphone. 

6. (Original) A speech processing apparatus comprising: 

means for determining a spectrum for each frame by the Fourier transform; 

means for determining a cepstrum from the spectrum obtained; 

means for determining for each talker a cepstrum mean of a speech frame 
and a cepstrum mean of a non-speech frame, separately, from the cepstrums obtained; 
and 

means for subtracting the cepstrum mean of the speech frame of each 
talker from the cepstrum of the speech frame of the talker and for subtracting the 
cepstrum mean of the non-speech frame of each talker from the cepstrum of the non- 
speech frame of the talker to correct multiplicative distortions that are dependent on 
microphone characteristics and spatial transfer characteristics from the mouth of the 
talker to the microphone. 
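
For illustration only, and not as part of the claim language, the separate speech and non-speech cepstrum mean subtraction of claim 6 can be sketched for a single talker as below. The sketch assumes the cepstrum is a DCT-II of the log magnitude spectrum and that the speech/non-speech labels are supplied by the caller. The claim states neither, and the names `cepstrum_from_spectrum`, `normalize_cepstra`, `n_coeffs`, and `is_speech` are hypothetical.

```python
import math


def cepstrum_from_spectrum(magnitudes, n_coeffs, eps=1e-10):
    """Cepstrum as a DCT-II of the log magnitude spectrum (assumed form)."""
    logs = [math.log(m + eps) for m in magnitudes]
    n = len(logs)
    return [
        sum(v * math.cos(math.pi * q * (k + 0.5) / n)
            for k, v in enumerate(logs))
        for q in range(n_coeffs)
    ]


def normalize_cepstra(cepstra, is_speech):
    """Subtract class-wise cepstrum means for one talker.

    cepstra: list of cepstrum vectors, one per frame.
    is_speech: parallel list of booleans (True for a speech frame).
    """
    def mean(vectors):
        return [sum(col) / len(vectors) for col in zip(*vectors)]

    speech = [c for c, s in zip(cepstra, is_speech) if s]
    non_speech = [c for c, s in zip(cepstra, is_speech) if not s]
    speech_mean = mean(speech) if speech else None
    non_speech_mean = mean(non_speech) if non_speech else None

    normalized = []
    for c, s in zip(cepstra, is_speech):
        m = speech_mean if s else non_speech_mean
        normalized.append([ci - mi for ci, mi in zip(c, m)])
    return normalized
```

A fixed multiplicative distortion in the spectrum becomes a constant additive offset in the log domain, and hence in the cepstrum, which is why subtracting a mean removes it. Running the function once per talker gives the per-talker means recited in the claim.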



7. (Previously presented) A speech processing method comprising: 

a generation step for generating a pseudo acoustic echo signal for each 
sample based on a current impulse response simulating an acoustic echo transfer path 
and on a source signal; 

a supply step for holding the current impulse response for each sample 
and supplying the current impulse response to said generation step; 

an elimination step for subtracting said pseudo acoustic echo signal from 
a near-end speech signal to remove an acoustic echo component and thereby generate 
an acoustic echo-canceled signal for each sample; 

an update step for continually updating the impulse response for each 
sample by using said source signal, said acoustic echo-canceled signal and the current 
impulse response held by the supply step and for supplying the updated impulse 
response to said supply step; 

a decision step for checking, in each frame, whether or not a voice is 
included in the near-end speech signal, by using time domain information and frequency 
domain information of said acoustic echo-canceled signal; 

a storage step for storing one or more impulse responses in each frame; 

and 

a control step for, in a frame for which the result of decision made by said 
decision step is negative, storing in said storage step the current impulse response held 
by the supply step and, in a frame for which the result of decision is positive, retrieving 
one of the impulse responses stored in said storage step and supplying it to said supply 
step. 

8. (Original) A speech processing method as claimed in claim 7, wherein 
said acoustic echo-canceled signal is used for speech recognition. 

9. (Original) A speech processing method as claimed in claim 8, further 
comprising: 

a step for determining a spectrum for each frame by performing the 
Fourier transform on said acoustic echo-canceled signal; 

a step for successively determining a spectrum mean for each frame 
based on the spectrum obtained; and 

a step for successively subtracting the spectrum mean from the spectrum 
calculated for each frame from said acoustic echo-canceled signal to remove additive 
noise of an unknown source. 

10. (Original) A speech processing method as claimed in claim 8, further 
comprising: 

a step for determining a spectrum for each frame by performing the 
Fourier transform on said acoustic echo-canceled signal; 

a step for successively determining a spectrum mean for each frame 
based on the spectrum obtained; 

a step for successively subtracting the spectrum mean from the spectrum 
calculated for each frame from said acoustic echo-canceled signal to remove additive 
noise of an unknown source; 

a step for determining a cepstrum from the spectrum removed of the 
additive noise; 

a step for determining for each talker a cepstrum mean of a speech frame 
and a cepstrum mean of a non-speech frame, separately, from the cepstrums obtained; 
and 

a step for subtracting the cepstrum mean of the speech frame of each 
talker from the cepstrum of the speech frame of the talker and for subtracting the 
cepstrum mean of the non-speech frame of each talker from the cepstrum of the non- 
speech frame of the talker to correct multiplicative distortions that are dependent on 
microphone characteristics and spatial transfer characteristics from the mouth of the 
talker to the microphone. 

11. (Original) A speech processing method as claimed in claim 8, further 
comprising: 

a step for determining a spectrum for each frame by performing the 
Fourier transform on said acoustic echo-canceled signal; 

a step for determining a cepstrum from the spectrum obtained; 

a step for determining for each talker a cepstrum mean of a speech frame 
and a cepstrum mean of a non-speech frame, separately, from the cepstrums obtained; and 

a step for subtracting the cepstrum mean of the speech frame of each 
talker from the cepstrum of the speech frame of the talker and for subtracting the 
cepstrum mean of the non-speech frame of each talker from the cepstrum of the non- 
speech frame of the talker to correct multiplicative distortions that are dependent on 
microphone characteristics and spatial transfer characteristics from the mouth of the 
talker to the microphone. 

12. (Original) A speech processing method comprising: 

a step for determining a spectrum for each frame by the Fourier transform; 

a step for determining a cepstrum from the spectrum obtained; 

a step for determining for each talker a cepstrum mean of a speech frame 
and a cepstrum mean of a non-speech frame, separately, from the cepstrums obtained; 

and 

a step for subtracting the cepstrum mean of the speech frame of each 
talker from the cepstrum of the speech frame of the talker and for subtracting the 
cepstrum mean of the non-speech frame of each talker from the cepstrum of the non- 
speech frame of the talker to correct multiplicative distortions that are dependent on 
microphone characteristics and spatial transfer characteristics from the mouth of the 
talker to the microphone. 